METHOD FOR ADAPTING QUANTIZATION IN VIDEO CODING USING
FACE DETECTION AND VISUAL ECCENTRICITY WEIGHTING
[...] pending patent application
No. 60/071,099, filed January [...]
BACKGROUND OF THE INVENTION 



The present invention relates to a system for encoding facial
regions of a video that incorporates a model of the human visual
system to encode frames in a manner that provides a substantially
uniform apparent quality.

In many systems the number of bits available for encoding a video,
consisting of a plurality of frames, is fixed by the bandwidth
available in the system. Typically encoding systems use an ad hoc
control technique to select quantization parameters that will
produce a target number of bits for the video while simultaneously
attempting to encode the video frames with the highest possible
quality. For example, in digital video recording, a group of frames
must occupy the same number of bits for an efficient
fast-forward/fast-rewind capability. In videotelephones, the channel
rate, communication delay, and the size of the encoder buffer
determine the number of available bits for a frame.



There are numerous systems that address the problem of how to encode
video to achieve high quality while controlling the number of bits
used. The systems are usually known as rate, quantizer, or buffer
control techniques and can be generally classified into three major
classes.

The first class are systems that encode each block of the image
several times with a set of different quantization factors, measure
the number of bits produced for each quantization factor, and then
attempt to select a quantization factor for each block so that the
total number of bits for all the blocks equals a target number.
While generally accurate, such a technique is not suitable for
real-time encoding systems because of its high computational
complexity.

The second class are systems that measure the number of bits used in
previously encoded image blocks, buffer fullness, and block
activity, and use all these measures to select a quantization factor
for each block of the image. Such techniques are popular for
real-time encoding systems because of their low computational
complexity. Unfortunately, such techniques are quite inaccurate and
must be combined with additional techniques to avoid bit or buffer
overflows and underflows.

The third class are systems that use a model to predict the number
of bits necessary for encoding each of the image blocks in terms of
the block's quantization factor and other simple parameters, such as
block variances. These models are generally based on mathematical
approximations or predefined tables. Such systems are
computationally simple and are suitable for real-time systems, but
unfortunately they are highly sensitive to inaccuracies in the model
itself.

Some rate control systems incorporate face detection. One such
system, along with other systems that use face detection, is
described below.

Zhou, U.S. Patent No. [...]581, discloses a low bit rate audio and
video communication system that dynamically allocates bits among the
audio and video information based upon the perceptual significance
of the audio and video information. In a video teleconferencing
system Zhou suggests that the perceptual quality can be improved by
allocating more of the video bits to encode the facial region of the
person than the remainder of the scene. In addition, Zhou suggests
that the mouth area, including the lips, jaw, and [...], be
allocated more video bits than the remainder of the face because of
the motion of these portions. In order to encode the face and mouth
areas more accurately Zhou uses a subroutine that incorporates
manual initialization of the position of each speaker within a video
screen. Unfortunately, the manual identification of the facial
region is unacceptable for automated systems.



Kosemura et al., U.S. Patent No. 5,187,574, disclose a system for
automatically adjusting the field of view of a television door phone
in order to keep the head of a person centered in the image frame.
The detection system relies on detecting the top of the person's
head by comparing corresponding pixels in successive images. The
number of pixels is counted along a horizontal line to determine
the location of the head. However, such a head detection technique
is not robust.

Sexton, U.S. Patent No. 5,086,480, discloses a video image
processing system in which an encoder identifies the head of a
person from a head-against-a-background scene. The system uses
training sequences and fits a minimum rectangle to the candidate
pixels. The underlying identification technique uses vector
quantization. Unfortunately, the training sequences require the use
of an anticipated image which will be matched to the actual image.
If the actual image in the scene does not sufficiently match any of
the training sequences then the head will not be detected.

Lambert, U.S. Patent No. 5,012,522, discloses a system for locating
and identifying human faces in video scenes. A face finder module
searches for facial characteristics, referred to as signatures,
using a template. In particular, the signatures [...] the eye and
nose/mouth. Unfortunately, such a template based technique is not
robust to occlusions, [...] changes, and variations in the facial
[...].

Ueno et al., U.S. Patent No. 4,951,140, disclose a facial region
detecting circuit that detects a face based on the difference
between two frames of a video using a histogram based technique. The
system allocates more bits to the facial region than the remaining
region. However, such a histogram based technique may not
necessarily detect the face in the presence of significant motion.

Moghaddam et al., in a paper entitled "An Automatic System for
Model-Based Coding of Faces," IEEE Data Compression Conference,
March 1995, disclose a system for two-dimensional image encoding of
human faces. The system uses eigen-templates for template matching,
which is computationally intensive.

Eleftheriadis et al., in a paper entitled "Automatic Face Location
Detection and Tracking for Model-Assisted Coding of Video
Teleconferencing Sequences at Low Bit-Rates," Signal Processing:
Image Communication 7 (1995), disclose a model-assisted coding
technique which exploits the face location information of video
sequences to selectively encode regions of the video to produce
coded sequences in which the facial regions are clearer and sharper.
In particular, the system initially differences two frames of a
video to detect motion. Then the system attempts to locate the top
of the head of a person by searching for a sequential series of
non-zero horizontal pixels in the difference image, as shown in FIG.
[...] of Eleftheriadis et al. A set of ellipses with various sizes
and aspect ratios having their uppermost portion fixed at the
potential location of the top of the head are fitted to the image
data. Unfortunately, scanning the difference image for potential
sequences of non-zero pixels is complex and time consuming. In
addition, the system taught by Eleftheriadis et al. includes many
design parameters that need to be selected for each particular
system and video sequence, making it difficult to adapt the system
for different types of video sequences and systems.

Glenn, in a chapter entitled "Real-Time Display Systems, Present and
Future," from the book Visual Science and Engineering, edited by
D.H. Kelly, 1994, teaches a display system that varies the
resolution of the image from the center to the edge, in the hope
that the decrease in resolution would lead to a bandwidth reduction.
The resolution decrease is accomplished by discarding pixel
information to blur the image. The presumption in Glenn is that the
observer is looking at the center of the display. The attempt was
unsuccessful because although it was found that the observer's eyes
tended to stay in the center one-quarter of the total image area,
the resolution at the edges of the image could not be sufficiently
reduced before the resulting blur was detectable.

Browder et al., in a paper entitled "Eye-Slaved Area-Of-Interest
Display Systems: Demonstrated Feasible In The Laboratory," process
video sequences using gaze-contingent techniques. The
gaze-contingent processing is implemented by adaptively varying
image quality within each video field, such that image quality is
maximal in the region most likely to be viewed while being reduced
in the periphery. This image quality reduction is accomplished by
blurring the image or by introducing quantization artifacts. The
system includes an eye tracker with a computer graphic flight
simulator. Two image sequences are created. One sequence has a
narrow field of view (19 or 25 degrees) with high resolution and the
other sequence has a wide field of view (76 or 140 degrees) with low
resolution. The two image sequences are combined optically with the
high resolution sequence enslaved to the visual system's center of
gaze. To keep the boundary between the two regions from being
distracting, an arbitrary linear roll-off (blending) from the high
resolution inset image to the low resolution image is used. The use
of an eye tracker in the system is unsuitable for inexpensive video
telephones where such an eye tracker is not provided. In addition,
the linear roll-off does not match the eye's sensitivity variation,
resulting in either variable image quality or unnecessary regions
of high resolution.

Stelmach et al., in a paper entitled "Processing Image Sequences
Based On Eye Movements," disclose a video encoding system that
employs the concept of varying the visual sensitivity as a function
of expected eye position. The expected eye position is generated by
measuring a set of observers' eye movements to specific video
sequences. Then the averaged eye movements are calculated for the
set of observers. However, such a system requires measurements of
the eye position which may not be available for inexpensive
teleconferencing systems. In addition, it is difficult, if not
impossible, to extend the system to an unknown image sequence, thus
requiring observer measurements for any image sequence the system is
going to encode. Moreover, variation of the resolution is not an
efficient technique for bandwidth reduction.

What is desired, therefore, is a video encoding system that
automatically locates facial regions within the video and encodes
the video in a manner that provides a uniform quality of the video
to the viewer.



SUMMARY OF THE PRESENT INVENTION

The present invention overcomes the aforementioned drawbacks of the
prior art by providing a system for encoding video that detects the
location of a facial region of a frame of the video. Sensitivity
information is calculated for each of a plurality of locations
within the video based upon the location of the facial region. The
frame is encoded in a manner that provides a substantially uniform
apparent quality of the plurality of locations to the viewer when
the viewer is observing the facial region of the video.

In one embodiment, the detection of the facial region includes
receiving a first frame and a subsequent frame of the video, each of
which includes a plurality of pixels. A difference image is
calculated representative of the difference between a plurality of
the pixels of the first frame and a plurality of the pixels of the
subsequent frame. A plurality of candidate facial regions are
determined within the difference image, preferably based on a
transform of the difference image in a spatial domain to a parameter
space. The candidate facial regions are fitted to the difference
image to select one of the candidate facial regions.

In another embodiment, the detection of the facial region includes
fitting the candidate facial regions to the difference image to
select one of the candidate facial regions based on a combination of
at least two of the following three factors: a fit factor
representative of the fit of the candidate facial regions to the
difference image, a location factor representative of the location
of the candidate facial regions within the video, and a size factor
representative of the size of the candidate facial regions.

In yet another embodiment, the sensitivity information is calculated
for each of the plurality of locations within the video based upon
both the location of the facial region within the video in relation
to the plurality of locations and a non-linear model of the
sensitivity of a human visual system.

In a further embodiment, a target bit value equal to a total number
of bits available for encoding the frame is identified. The
sensitivity information is calculated for each one of the blocks
based upon the sensitivity of a human visual system observing a
particular region of the image. Quantization values for each of the
multiple blocks are calculated to provide substantially uniform
apparent quality of each of the blocks in the frame subject to a
constraint that the total number of bits available for encoding the
frame is equal to the target bit value. The blocks are encoded with
the quantization values.

The foregoing and other objectives, features, and advantages of the
invention will be more readily understood upon consideration of the
following detailed description of the invention, taken in
conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of an exemplary embodiment of a face
detection module of the present invention.

FIG. 2 is an example of location weightings for the face detection
module of FIG. 1.

FIG. 3 is an example of radii limits of the face detection module of
FIG. 1.

FIG. 4 is an example of centers of considered ellipses of the face
detection module of FIG. 1.

FIG. 5 is a block diagram of an exemplary embodiment of a visual
model of the present invention.

FIG. 6 illustrates the relationship between the distance on the
display of a viewer's focus and the resulting visual angle of the
viewer.

FIG. 7 illustrates an eccentricity in visual angle for each location
as a function of the distance from the detected region boundary.

FIG. 8 illustrates an eccentricity versus location for a series of
viewing distances.

FIG. 9 illustrates a set of visual sensitivity data sets for
absolute sensitivity of the human visual system.

FIG. 10 illustrates the visual sensitivity as a function of pixel
location.

FIG. 11 illustrates the resulting cross section of sensitivity
values for an elliptical object.

FIG. 12 is an exemplary embodiment of a block diagram of a
block-based image encoding system of the present invention.

FIG. 13 illustrates a set of quantization steps versus block number
for one row of blocks in a frame.

FIG. 14 is an exemplary block diagram of an encoder including the
face detection module of FIG. 1, the visual model of FIG. 5, and the
block-based image encoding system of FIG. 12, of the present
invention.

FIG. 15 is an exemplary block diagram of a decoder of the present
invention.



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In very low bit rate video teleconferencing systems,
state-of-the-art coding techniques produce artifacts which are
systematically present throughout coded images. The number of
artifacts increases with both increased motion between frames and
increased image texture. In addition, the artifacts usually affect
all areas of the image without discrimination. However, viewers will
mostly notice those coding artifacts in areas of particular interest
to them. In particular, a viewer of a video teleconferencing system
or video telephone will typically focus his or her attention on the
face(s) of the person(s) on the screen, rather than on areas such as
clothing or background. While fast motion may mask many coding
artifacts, the human visual system has the ability to lock on and
track particular moving objects, such as a person's face.
Accordingly, communication between viewers of very low bit-rate
video teleconferencing systems or video telephones will be
intelligible and pleasing only when the person's face and facial
features are not plagued with an excessive amount of coding
artifacts.

Referring to FIG. 1, a face detection module 10 receives frame i 12
and frame i+n 14, each consisting of a plurality of pixels. Frames i
and i+n may be immediately successive frames or frames spaced apart
by n frames. Frame i 12 is resized by a scale image block 16 to
reduce its number of pixels. The pixel reduction reduces the
computational requirements of the system by narrowing the search
space. Frame i+n 14 is likewise resized by a scale image block 18 in
the same manner as the scale image block 16. The scale factor used
by the scale image blocks 16 and 18 is variable so that the
resulting number of pixels may be selected to provide sufficient
image detail. When initially detecting a face within a sequence of
images the scale factor is preferably twice (or any other suitable
value) the scale factor used in subsequent calculations. Thus, when
initially detecting a face the resulting image will include
substantially more pixels than during subsequent tracking. This
ensures a good initial determination of the face location and
reduces the computational requirements of subsequent tracking
calculations since the initial location is used as a starting
location to narrow the search space.
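The resizing performed by the scale image blocks 16 and 18 can be illustrated with a small sketch. The source does not state how the resizing is carried out, so block averaging is assumed here purely for illustration; the function name `scale_image` and its integer `reduction` parameter are hypothetical. Note that `reduction` is a shrink factor, so a smaller value yields more pixels, as would be wanted for the initial detection.

```python
def scale_image(frame, reduction):
    """Shrink a frame by averaging non-overlapping reduction x reduction blocks.

    `frame` is a list of rows of luminance values. Block averaging is an
    assumption of this sketch; the source only says the frame is resized.
    Rows and columns that do not fill a whole block are dropped.
    """
    rows, cols = len(frame), len(frame[0])
    scaled = []
    for r0 in range(0, rows - reduction + 1, reduction):
        out_row = []
        for c0 in range(0, cols - reduction + 1, reduction):
            block = [frame[r][c]
                     for r in range(r0, r0 + reduction)
                     for c in range(c0, c0 + reduction)]
            out_row.append(sum(block) / len(block))
        scaled.append(out_row)
    return scaled
```

With this sketch, tracking might call `scale_image(frame, 4)` while the initial detection calls `scale_image(frame, 2)`, so that the initial search works on roughly four times as many pixels; those particular values are illustrative only.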

In many applications involving a person's head moving on a constant
background, such as a video telephone, the movement of the head may
be detected by subtracting two frames of the video from one another.
The resulting non-zero values from the subtracted frames may be
representative of motion, such as the head of a person. A difference
image block 20 subtracts the scaled images from the scale image
blocks 16 and 18 from one another to obtain a difference image 21.
More particularly, the difference image 21 is obtained by:

$$d_{i+n}(k) = I_{i+n}(k) - I_i(k)$$

where k is the spatial location of the pixel in the resized image
frame, I is the luminance, and the subscripts indicate the temporal
location of the image frame.
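The differencing step can be sketched as follows. This is a minimal illustration that assumes each frame is a plain list of rows of luminance values; the function name `difference_image` is hypothetical and does not come from the source.

```python
def difference_image(frame_i, frame_i_plus_n):
    """Return d(k) = I_{i+n}(k) - I_i(k) for every pixel location k.

    Both frames are lists of rows of luminance values and must have
    identical dimensions (an assumption of this sketch).
    """
    if len(frame_i) != len(frame_i_plus_n):
        raise ValueError("frames must have the same number of rows")
    diff = []
    for row_old, row_new in zip(frame_i, frame_i_plus_n):
        if len(row_old) != len(row_new):
            raise ValueError("frames must have the same number of columns")
        # Signed difference: later frame minus earlier frame.
        diff.append([new - old for old, new in zip(row_old, row_new)])
    return diff
```

Pixels on a static background produce zeros, while pixels covered or uncovered by a moving head produce non-zero values of either sign.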

A clean difference image block 22 attempts to remove undesirable
non-zero pixels from the difference image 21 and add desirable
non-zero pixels to the difference image 21. Initially, the clean
difference image block 22 performs a thresholding operation on the
difference image 21:

$$
d^{th}_{i+n}(k) =
\begin{cases}
0, & |d_{i+n}(k)| < T \\
1, & |d_{i+n}(k)| \ge T
\end{cases}
$$

where T is a predefined threshold value and |.| denotes absolute
value. The resulting thresholded image is represented by 1's and
0's. Thereafter, morphological operations are performed on the
thresholded image which, for example, count the number of non-zero
pixels in a plurality of regions of the image. If the non-zero pixel
count within a region is sufficiently small then all the pixels
within that region are set to zero. This removes scattered noise
within the thresholded image. Likewise, if the non-zero pixel count
within a region is sufficiently large then all the pixels within
that region are set to one. This enhances those regions of the image
indicative of motion. The overall effect of the morphological
operations is that scattered ungrouped non-zero pixels are set to
zero and holes (indicated by zeros) in the grouped non-zero regions
are set to one. The output from the clean difference image block 22
is a cleaned image 23.
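The thresholding and region-count cleaning described above might look like the sketch below. It is illustrative only: the non-overlapping square tiling, the tile size, and the `low` and `high` count limits are assumptions, as are the function names, and treating a magnitude exactly equal to T as motion is also an assumption of this sketch.

```python
def threshold_image(diff, T):
    """Binarize a difference image: 1 where |d(k)| >= T, otherwise 0."""
    return [[1 if abs(v) >= T else 0 for v in row] for row in diff]


def clean_image(binary, region=2, low=1, high=3):
    """Clean a binary image using per-region non-zero pixel counts.

    The image is split into non-overlapping region x region tiles (an
    assumed layout). Tiles with at most `low` non-zero pixels are cleared,
    which removes scattered noise. Tiles with at least `high` non-zero
    pixels are filled, which closes holes. Other tiles are left unchanged.
    """
    rows, cols = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for r0 in range(0, rows, region):
        for c0 in range(0, cols, region):
            cells = [(r, c)
                     for r in range(r0, min(r0 + region, rows))
                     for c in range(c0, min(c0 + region, cols))]
            count = sum(binary[r][c] for r, c in cells)
            if count <= low:
                fill = 0
            elif count >= high:
                fill = 1
            else:
                continue
            for r, c in cells:
                out[r][c] = fill
    return out
```

A practical system would tune the region size and count limits to the resized frame dimensions; the defaults here exist only to keep the example small.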

Next, the facial regions are identified within the cleaned image. A
face generally has an elliptical shape so the face detection module
10 adopts an ellipse as a model to represent a face within the
image. Although the upper (hair) and lower (chin) areas in actual
face outlines may have quite different curvatures, ellipses provide
a good trade-off between model accuracy and parametric simplicity.
Moreover, due to the fact that the elliptical information is not
actually used to regenerate the face outline, a small lack of
model-fitting accuracy does not have any significant impact on the
overall performance of the coding process.

An ellipse of arbitrary size and "tilt" can be represented by the
following quadratic, non-parametric equation (implicit form):

$$ax^2 + 2bxy + cy^2 + 2dx + 2ey + f = 0$$

$$b^2 - ac < 0$$
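To make the implicit form concrete, the sketch below derives the six coefficients for the special case of an untilted ellipse given by its center and radii, and checks the discriminant condition. The helper names are hypothetical, and restricting to zero tilt (so that b = 0) is a simplification chosen for this example rather than something the source requires.

```python
def axis_aligned_ellipse_coefficients(xc, yc, xr, yr):
    """Coefficients (a, b, c, d, e, f) of an untilted ellipse.

    Obtained by multiplying (x-xc)^2/xr^2 + (y-yc)^2/yr^2 = 1 through by
    xr^2 * yr^2 and matching terms with
    a x^2 + 2b xy + c y^2 + 2d x + 2e y + f = 0.
    """
    a = yr * yr
    b = 0  # no xy term without tilt
    c = xr * xr
    d = -a * xc
    e = -c * yc
    f = a * xc * xc + c * yc * yc - a * c
    return a, b, c, d, e, f


def implicit_value(coeffs, x, y):
    """Evaluate the quadratic form; zero means (x, y) lies on the curve."""
    a, b, c, d, e, f = coeffs
    return a * x * x + 2 * b * x * y + c * y * y + 2 * d * x + 2 * e * y + f


def satisfies_ellipse_condition(coeffs):
    """Check the discriminant condition b^2 - ac < 0."""
    a, b, c = coeffs[:3]
    return b * b - a * c < 0
```

For a center of (3, 4) with radii 2 and 5, the points (5, 4) and (3, 9) evaluate to zero, and the center itself evaluates to a negative value.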

To reduce the computational requirements, the system initially
identifies the top portion of a person's head with a model of a
circle with an arbitrary radius and center. The top of the head has
a predictable circular shape for most people so it provides a
consistent indicator for a person. Decision block 25 branches to a
select candidate circles block 24 if the initial determination of a
face location for a sequence of video is necessary, such as a new
video or scene of a video. The select candidate circles block 24
identifies candidate circles 27 as the top m peaks, where m is a
preset parameter, in an accumulator array of a Hough transform of
the image. A suitable Hough transform for circles is:

$$A(x_c, y_c, r) = A(x_c, y_c, r) + 1 \quad \forall\, x_c, y_c, r \;\ni\; (x - x_c)^2 + (y - y_c)^2 = r^2$$

where A(.,.,.) is the accumulator array and (x,y) are pixels in the
cleaned difference image which exceed the threshold. The Hough
transform maps the image into a parameter space to identify shapes
that can be parameterized. By mapping the cleaned difference image
to parameter space, the actual shapes corresponding to the transform
can be identified, in contrast to merely looking for a series of
pixels in the image space, which does not accurately detect suitable
curvatures of the face, as taught by Eleftheriadis.
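A minimal sketch of the circle Hough accumulation and top-m peak selection follows. It assumes a small binary image held as a list of rows and a caller-supplied list of candidate radii; the function name `hough_circles`, the angular sampling density, and the use of a dictionary as the accumulator array are illustrative choices, not details from the source. On a pixel grid the circle equation can only be satisfied approximately, so centers are rounded to the nearest pixel.

```python
import math


def hough_circles(binary, radii, m):
    """Accumulate circle votes and return the top m (key, votes) peaks.

    Each non-zero pixel (x, y) votes once for every in-bounds center
    (xc, yc) and radius r with (x - xc)^2 + (y - yc)^2 approximately r^2.
    Keys are (xc, yc, r) tuples.
    """
    rows, cols = len(binary), len(binary[0])
    acc = {}
    for y in range(rows):
        for x in range(cols):
            if not binary[y][x]:
                continue
            for r in radii:
                voted = set()
                # Sample enough angles to reach each cell on the circle.
                steps = max(8, int(math.ceil(2 * math.pi * r * 2)))
                for s in range(steps):
                    t = 2 * math.pi * s / steps
                    xc = int(round(x - r * math.cos(t)))
                    yc = int(round(y - r * math.sin(t)))
                    if 0 <= xc < cols and 0 <= yc < rows:
                        voted.add((xc, yc, r))
                # One vote per pixel for each distinct (xc, yc, r).
                for key in voted:
                    acc[key] = acc.get(key, 0) + 1
    peaks = sorted(acc.items(), key=lambda kv: (-kv[1], kv[0]))
    return peaks[:m]
```

The returned keys play the role of the candidate circles 27. The cost grows with the number of non-zero pixels times the number of radii, which is one reason to resize the frames before this step.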

A score candidate circles block 26 scores each of the candidate
circles 27, in particular the fit of the cleaned difference image 23
to the candidate circle 27. The fit criterion used is as follows. If
C is a candidate circle 27 then let M_C be a mask such that,

$$
M_C(k) =
\begin{cases}
1, & k \text{ inside or on } C \\
0, & \text{otherwise}
\end{cases}
$$

A pixel k is on the circle contour, denoted C_i, if the pixel is
inside or on the circle, and at least one of the pixels in its
(2L+1) x (2L+1) neighborhood is not. A pixel k is on the circle
border, denoted by C_e, if the pixel is outside the circle, and at
least one of the pixels in its (2L+1) x (2L+1) neighborhood is
either inside or on the circle. The normalized average intensities
I_i and I_e are defined:

$$I_i = \frac{1}{|C_i|} \sum_{k \in C_i} d^{th}_{i+n}(k)$$

and

$$I_e = \frac{1}{|C_e|} \sum_{k \in C_e} d^{th}_{i+n}(k)$$

where |.| denotes cardinality. The measure of fit is then defined
as:

$$R = \frac{1 + I_i}{1 + I_e}$$

A large value of R indicates a good fit of the data to the candidate
circle 27. In contrast, a small value of R indicates a poor fit of
the data to the candidate circle 27.
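The fit measure can be sketched directly from the definitions of the contour set C_i, the border set C_e, and R. The function name `circle_fit` is hypothetical, the image is assumed to be a binary list of rows, and returning an average of zero for an empty set is a choice made here to avoid division by zero rather than something stated in the source.

```python
def circle_fit(binary, xc, yc, r, L=1):
    """Return R = (1 + I_i) / (1 + I_e) for a candidate circle.

    I_i averages the thresholded image over the contour set C_i (pixels
    inside or on the circle that have a neighbour outside it). I_e averages
    it over the border set C_e (pixels outside the circle that have a
    neighbour inside or on it). Neighbourhoods are (2L+1) x (2L+1).
    """
    rows, cols = len(binary), len(binary[0])

    def inside(x, y):
        return (x - xc) ** 2 + (y - yc) ** 2 <= r * r

    def neighbours(x, y):
        for dy in range(-L, L + 1):
            for dx in range(-L, L + 1):
                nx, ny = x + dx, y + dy
                if (dx or dy) and 0 <= nx < cols and 0 <= ny < rows:
                    yield nx, ny

    sum_i = n_i = sum_e = n_e = 0
    for y in range(rows):
        for x in range(cols):
            if inside(x, y):
                if any(not inside(nx, ny) for nx, ny in neighbours(x, y)):
                    sum_i += binary[y][x]
                    n_i += 1
            elif any(inside(nx, ny) for nx, ny in neighbours(x, y)):
                sum_e += binary[y][x]
                n_e += 1
    i_i = sum_i / n_i if n_i else 0.0  # empty set treated as zero (assumption)
    i_e = sum_e / n_e if n_e else 0.0
    return (1 + i_i) / (1 + i_e)
```

Because both averages lie between 0 and 1 for a binary image, R ranges from 0.5 (nothing on the contour, everything on the border) to 2.0 (the reverse).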


While the respective value of R provides a reasonable estimation of
the appropriateness of the respective candidate circle 27, the
present inventors came to the realization that video
teleconferencing devices have implicit characteristics that may be
exploited to further determine the appropriateness of candidate
circles. In most video telephone applications the head is usually
centrally located in the upper third of the image. Moreover, the
size of the face is usually within a range of sizes and thus
candidate circles that are exceedingly small or excessively large
are not suitable. Accordingly, in addition to the fit data, the
score candidate circles block 26 also examines the size and location
of the circle. Referring to FIG. 2, the outer border region 40 of a
display 38 is an unsuitable location for a center of a candidate
circle 27, the central upper third region 42 of the display 38 is a
desirable location for the center of a candidate circle 27, and the
remaining region 44 of the display 38 is acceptable. For example,
the undesirable outer border region 40 may have a weighting factor
of 0.25, the acceptable region 44 a weighting factor of 0.5, and the
desirable central upper third region 42 a weighting of 1.0.
Referring to FIG. 3, the radii of candidate circles 27 likewise have
a similar distribution of suitability. A candidate circle 27 with a
radius less than the small radii 50 is undesirable and may be given
a weighting of 0.[...]. A candidate circle 27 with a radius between
the small radii 50 and an intermediate radii 52 is desirable and may
be given a weighting of 0.6. The remainder of the possible large
candidate circle radii 53 are undesirable and may be given a
weighting of 0.2. Any other suitable weighting systems may be used
for the radii and locations.

The three parameters used to determine the suitability of a
candidate circle are the fit of the candidate circle to the cleaned
image data, the location of the candidate circle's center, and the
size of the candidate circle's radii. Any suitable ratio of the
three parameters may be used, such as
(0.5)*fit+(0.25)*center+(0.25)*size. The candidate circles with the
highest score are subsequently used as potential locations of the
face in the image for matching with candidate ellipses to more
accurately model the face. Using circles for the initial
determination provides a fast, computationally efficient technique
for determining candidate face locations.
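The weighted combination of the three parameters can be written out as a short sketch. The 0.5/0.25/0.25 ratio is the example given in the text; the function names, the tuple layout of a candidate, and the sample weights in the usage note are hypothetical.

```python
def circle_score(fit, center_weight, size_weight,
                 w_fit=0.5, w_center=0.25, w_size=0.25):
    """Combine fit, location weighting, and size weighting into one score."""
    return w_fit * fit + w_center * center_weight + w_size * size_weight


def top_candidates(candidates, count):
    """Return the `count` best-scoring candidates.

    Each candidate is a tuple (label, fit, center_weight, size_weight);
    this layout is an assumption of the sketch.
    """
    ranked = sorted(candidates,
                    key=lambda c: circle_score(c[1], c[2], c[3]),
                    reverse=True)
    return ranked[:count]
```

For instance, a circle with fit 2.0 centered in the desirable region (weight 1.0) with a desirable radius (weight 0.6) scores 0.5*2.0 + 0.25*1.0 + 0.25*0.6 = 1.4, and would outrank an equally well fitting circle whose center lies in the border region.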

After the initial candidate circles are determined and scored, a
generate candidate ellipses block 28 generates a set of potential
ellipses 29 to be matched to the cleaned image 23 for each candidate
circle with a sufficient score. Ellipses with a center in the region
around the center of the suitable candidate circle and a set of
radii in the general range of that of the respective candidate
circle are considered. Referring to FIG. 4, the candidate ellipses
include a set of ellipse centers 31 in a region around the center 33
of the candidate circle. The variability of the centers of the
candidate ellipses in the vertical direction is greater than the
variability in the horizontal direction. The reason for the
increased variability in the vertical direction is that faces tend
to have an elliptical shape elongated in the vertical direction, so
the increased variability in the vertical direction of the center of
the candidate ellipse permits a better fit to the face location. In
contrast, [...] the horizontal variability is [...] the location of
the initial [...] number of candidate [...] computational
requirements. Preferably, a set of [...] within a region [...] the
circle center 33 [...] less than and greater than the [...] are
considered.

The candidate ellipses from the generate candidate ellipses block 28
are passed to a score candidate ellipses block 32 which scores each
candidate ellipse in the same manner as the score candidate circles
block 26, except that an elliptical mask is used instead of a
circular one. The score candidate ellipses block 32 may use the
location of the ellipse and its radii as additional parameters, if
desired. The candidate ellipse 39 with the highest score is then
output 41 by an output top candidate block 34 to the remainder of
the system. The parameters provided by the output top candidate
block 34 are:

center x    horizontal location of ellipse center
center y    vertical location of ellipse center
radius x    x axis radius of ellipse
radius y    y axis radius of ellipse
angle θ     tilt

The tilt parameter is optional. [...] other parameters of [...] may
likewise be provided, if desired, such as a diameter, which is
representative of its respective radius.

For subsequent frames, the decision block 25 branches [...] the
cleaned image 23 from the clean difference image block 22 is passed
to a select candidate ellipses block 30 [...] top candidate ellipse
41 [...]. The select candidate ellipses block 30 determines a set of
candidate ellipses [...]. The set of candidate ellipses is
substantially [...] in both the horizontal and vertical directions.
The reason to include the variability [...] of the generate
candidate ellipses block 28 [...] because [...] initial location
[...] subsequent tracking [...] the motion of the head itself, where
motion may be in either the vertical or the horizontal direction, as
opposed to a determination of where the head is based on an
inaccurate circular model. The radii selected for the candidate
ellipses (x and y) are likewise in a range similar to the previous
top candidate ellipse 41 because it is unlikely that the face has
become substantially larger or substantially smaller between frames
that are not significantly temporally different. Reducing the
difference in variability between the horizontal and vertical
directions reduces the computational requirements that would have
been otherwise required.

The candidate ellipses from the select candidate ellipses block 30
are passed to the score candidate ellipses block 32, as previously
described. The output top candidate block 34 outputs the candidate
ellipse with the highest score. The result is that after the initial
determination of the face location the face detection module 10
tracks the location of the face within the video.
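The generation of candidate ellipses around a reference location, whether that reference is a scored candidate circle or the previous top candidate ellipse, can be sketched as a simple enumeration. Everything about the ranges is an assumption made for illustration: the function name, the integer pixel offsets, and the default extents are hypothetical and not taken from the source.

```python
def candidate_ellipses(xc, yc, xr, yr, dx=1, dy=1, dr=1):
    """Enumerate candidate ellipses (cx, cy, rx, ry) near a reference.

    Centers vary by up to dx pixels horizontally and dy pixels vertically.
    Each radius varies by up to dr pixels and is kept at 1 or more.
    """
    candidates = []
    for cx in range(xc - dx, xc + dx + 1):
        for cy in range(yc - dy, yc + dy + 1):
            for rx in range(max(1, xr - dr), xr + dr + 1):
                for ry in range(max(1, yr - dr), yr + dr + 1):
                    candidates.append((cx, cy, rx, ry))
    return candidates
```

Under these assumptions, an initial detection might use a larger `dy` than `dx` to allow more vertical variability of the center, while tracking might use equal values. The candidate count grows as the product of the four ranges, which is why narrow ranges keep the scoring step inexpensive.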

The face detection module 10 may be extended to detect multiple
faces. In such a case the output top candidate block 34 would output
a set of parameters for each face detected.

Alternative face detection techniques may be used to determine the
location of the face within an image. In such a case the output of
the face detection module is representative of the location of the
face and its size within a video.



If desired, a gaze detection module which detects
the actual viewer's eye position may be used to determine
the location of the region of interest to the viewer within
a video. This may or may not be a face.

The present inventors^came tb theArealization that 
the human eye has a sensitivi ty to "^^ g^etk^^that^is.. 
dependant on the distance to the par.ferrctnar pWels of the 
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Hnear visualj^^^ sfcthft^— 
3^% he visual A^J^^^ 

determines the £ NL US 64 an 62 

ii2^ £ 4|%e^ « tJW fj^icipated ^"^ 
~a^« °* " 66 will depen* °* * , atio nshiP ^ 
Th e visual angle 66 w JJ \ lar relati ^ ^ 

listance „« * ^ o, i-* angulaI 

Rights, as *» * dista „ce and 

relationship x» " W vW9 * r 

sy ste. *— rV^,. WeAatjA' deter^ the 

particular display S det >^by V^ 9 din g the 

Relationship ^ * ether .^ft 1 " 
viewing Q \ Xe visual «del 60 

R , ^9 an eccent^ ' f the 
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where 0 e is the eccentricity in units of visual angle, y is 
the vertical pixel position in the image frame, and x is the
horizontal pixel position. The following four parameters
are the outputs from the face detection module 10: y_c, x_c,
y_r, and x_r, where x_c and y_c are the (x,y) center positions
of the selected ellipse in the frame, and x_r and y_r are the
elliptical radii in the horizontal and vertical directions,
respectively (i.e., the horizontal and vertical minor and
major axes are 2x_r and 2y_r, respectively). V is the viewing
distance in units of pixel distances (e.g., in viewing
an image with a height of 512 lines of pixels with a viewing
distance of 2 picture heights, V=2*512=1024).

Referring to FIG. 8, a graph of the eccentricity
(in visual angle) for a single pixel location for a series
of viewing distances, from 1 image height to 6 image
heights, is shown. The viewing location is the center of a
640 by 480 pixel display. For example, a viewer at a
distance of 6 image heights 70 observes 6 degrees of
eccentricity in comparison to a larger 35 degrees of
eccentricity at a distance of 1 image height 72 when looking
at the edge of the display. It is noted that x_r and y_r
are both zero in FIG. 8.
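
As a rough numerical check of the FIG. 8 example, the sketch
below assumes that, with x_r and y_r both zero, the
eccentricity reduces to the arctangent of the pixel distance
from the viewing location divided by the viewing distance V
in pixels. That simplified form and the helper name
eccentricity_deg are assumptions made for illustration. Under
them, the horizontal edge of a 640 by 480 display sits at
roughly 34 degrees from 1 image height and roughly 6 degrees
from 6 image heights, close to the figures quoted above.

```python
import math

def eccentricity_deg(x, y, x_c, y_c, viewing_distance_px):
    """Visual angle (degrees) between pixel (x, y) and the viewing point.

    Assumes a flat display viewed head-on and zero ellipse radii.
    """
    dist_px = math.hypot(x - x_c, y - y_c)
    return math.degrees(math.atan2(dist_px, viewing_distance_px))

width, height = 640, 480
x_c, y_c = width / 2, height / 2      # viewer looks at the display center
for heights in (1, 6):
    v = heights * height              # viewing distance in pixel units
    edge = eccentricity_deg(width, y_c, x_c, y_c, v)
    print(f"{heights} image height(s): {edge:.1f} degrees at the edge")
```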

The visual angle of the viewer to each pixel of
the image is then used as a basis of calculating, at block
63, the viewer's sensitivity to each pixel or block based on
a non-linear model of the human visual system. Referring to
FIG. 9 and the eccentricity calculation of FIG. 8, a set of
measured data sets 80 and 82 (actual data) for absolute
sensitivity of the human visual system is obtained across
all frequencies. The data sets 80 and 82 are used to
determine the maximum sensitivity to the frequency response
of the human visual system. A Cortical Magnification
Function (CMF) (shown below) fits the data well and provides
data set 84, which is a function of how many brain cells are
allocated to each visual field location. In essence, FIG. 9
illustrates a non-linear actual model of the sensitivity of
the human visual system as a function of eccentricity. The
sensitivity can be normalized for use in general rate
control or an absolute value where visually lossless quality
is needed. Applying the sensitivity data of FIG. 9 to a
pixel image results in an image of the same size as the
original pixel image (or a macro block sampled image) and
gives the visual sensitivity as a function of pixel
location, as shown in FIG. 10. The CMF equation governing
data set 84 and FIG. 10 is:



where S is the visual sensitivity, K_ECC is a constant
(preferred value is 0.24), and θ_E is the eccentricity in
visual angle as given by the eccentricity calculation above.
The CMF equation is referred to as the Cortical Magnification
Function. The result is a sensitivity image or map that can
be determined at any desired resolution with respect to the
starting image sequence. The CMF equation may also be
applied to the image where the viewer is observing any
arbitrary location 90, resulting in different sensitivity
values for the pixels. In the preferred embodiment, the
location 90 is the top candidate ellipse 41 for the
particular frame.
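
A minimal sketch of how such a sensitivity map could be built
is given here. It assumes the sensitivity takes the form
S = 1/(1 + K_ECC * θ_E) with θ_E in degrees; this form is an
assumption of the sketch, chosen because it uses the single
constant K_ECC = 0.24 named above and gives S = 1 at zero
eccentricity. The arctangent eccentricity, the zero ellipse
radii, the function names, and the 64 by 48 grid are likewise
illustrative only.

```python
import math

K_ECC = 0.24  # preferred constant quoted in the text

def sensitivity(theta_deg, k_ecc=K_ECC):
    """Assumed CMF form: 1 at the viewing point, falling with eccentricity."""
    return 1.0 / (1.0 + k_ecc * theta_deg)

def sensitivity_map(width, height, x_c, y_c, viewing_distance_px):
    """Per-pixel sensitivity for a viewer fixating (x_c, y_c); radii are zero."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            dist_px = math.hypot(x - x_c, y - y_c)
            theta = math.degrees(math.atan2(dist_px, viewing_distance_px))
            row.append(sensitivity(theta))
        rows.append(row)
    return rows

# Viewing distance of 2 picture heights, expressed in pixels.
s_map = sensitivity_map(64, 48, x_c=32, y_c=24, viewing_distance_px=2 * 48)
print(round(s_map[24][32], 3), round(s_map[0][0], 3))
```

Moving the viewing point (x_c, y_c) to another location
simply recenters the peak of the map.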

FIG. 11 illustrates the resulting cross section
of the sensitivity values for an elliptical object with a
radius of 100, centered at a first position (solid line) and
at position 98 (dashed line). It is also possible to use the
visual weighting of the image for multiple elliptical (or
other shapes) regions of importance. It is noted that the
cross sectional region of the indicated elliptical region is
constant, namely 1.

It is to be understood that other non-linear
models based on the actual human visual system may likewise
be used to associate sensitivity information with pixels or
regions of an image.

^Cuei V so produces sensit^ity image 

= » function of the location within the imag 
infor-ation as a ^ ^ The 

in relation " ^ \ „ to where i is the most 
values preferably range re has a 

senEltive . R eferrin g agai\ t ■ ^ pixel Qf 

T^tr r: aeo -e^ to he encoded h y an 

urtoo then - cular 

elected target number of bits\l a P typical 
system. The following description based 

Jock-based image encoder 100 guch as a 
that any other encoder may UkewiW be 

region or pixel based °»^HA block . base d image 

Referring to FIG. 12, in a block-based image
encoding system, such as MPEG-1, the image (frame) to be
encoded is decomposed into a plurality of image blocks 101 of
16x16 pixels per block. The pixels of each block are
transformed by a block transformer into a set of transform
coefficients, preferably by a Discrete Cosine Transform
(DCT). The resulting coefficients are quantized by a block
quantizer 104 and then encoded by a coder 106. The
quantization of the transform coefficients determines the
quality of the encoding of each image block 101. The
quantization of the ith image block 101 is controlled by
only one parameter, Q_i, of the block quantizer 104. In the
H.261 and H.263 encoding standards, Q_i is referred to as the
quantization step for the ith block and corresponds to half
the step size used for quantizing the transform
coefficients. In the MPEG-1 and the MPEG-2 standards, Q_i is
referred to as the quantization scale, and the jth
coefficient of a block is quantized using a quantizer of
step size Q_i*w_j, where w_j is the jth value of a
quantization matrix selected by the designer of the MPEG
codec. The H.261, H.263, MPEG-1, and MPEG-2 standards are
incorporated by reference herein.

The number of bits produced when encoding the ith
image block, B_i, is a function of the value of the
quantization parameter Q_i and the statistics of the block.
If Q_i is small, the image block is quantized more accurately
and the image block quality is higher, but such a fine
quantization produces a large number of bits (large B_i) for
the image block. Coarser quantization (large Q_i) produces
fewer bits (small B_i), but the image quality is also lower.

In image coding, the image blocks are said to be
intracoded, or of class intra. In video encoding, many of
the blocks in a frame are similar to corresponding blocks in
previous frames. Video systems typically predict the value
of the pixels in the current block from previously encoded
blocks and only the difference or prediction error is
encoded. Such predicted blocks are said to be intercoded,
or of the class inter. The techniques described herein are
suitable for intra, inter, or both intra and inter block
encoding techniques.

Referring to FIG. 13, a set of quantization steps
Q_j versus block number j for one row of blocks in a frame is
shown. There are three different video coding strategies
discussed below. Each technique is first briefly discussed,
then the latter two are discussed in detail.

FIRST VIDEO CODING STRATEGY
The first strategy is represented by line 120,
which uses the same quantization value Q for all the blocks
in the row. This may be referred to as the fixed-Q method.
The resulting number of bits to encode the row of blocks is
referred to as B.

SECOND VIDEO CODING STRATEGY
The second strategy is represented by the
staircased line 122. Q_j is set to Q for the block closest
to the location where the system has determined that the
viewer is observing, such as the face region. In FIG. 13,
the viewing location is shown as the middle of the row.
Q_j's are selected to be larger than Q for blocks farther
from the center. Since all the quantization steps are as
large as or larger than those for the fixed-Q strategy 120,
the staircased line 122 technique will encode the blocks in
the row with fewer bits. The resulting number of bits
necessary to encode the row of blocks using the staircased
line 122 technique is referred to as B', where B'<B. With
the proper selection of Q_j for each block the image quality
will appear uniform to the human eye, as described in detail
below. Accordingly, the perceived quality of the encoded
images using line 120 or line 122 will be the same, but, as
mentioned above, using line 122 will produce fewer bits.

THIRD VIDEO CODING STRATEGY
If the quantization steps of line 122 are reduced
by a constant, the number of bits necessary to encode the
blocks will be greater than B'. The staircase line 124
represents the steps Q_j' used for encoding the blocks
resulting in the same number of bits B as the line 120. The
blocks of the entire row will be perceived by the viewer as
having the same image quality, with the proper selection of
the Q_j' values. The center is quantized with a step size of
Q'<Q, resulting in the image quality at the center having a
better quality than the fixed-Q technique. Hence, the
perceived image quality to the viewer of the entire row,
which is substantially uniform, will be higher than the
fixed-Q case, even though both techniques use the same
number of bits B. The objective of the staircased line 124
is to find the proper Q_j' values automatically, so that the
pre-selected target number of bits (in this case B) is
achieved.

DETAILS OF SECOND STRATEGY
The present inventor came to the realization that
a coarser quantization on image blocks to which the viewer
is less sensitive can be performed without affecting the
perceived image quality. In fact, when encoding digital
video, the quantization factor can be increased according to
the sensitivities of the human visual system and thereby
decrease the number of bits necessary for each frame. In
particular, if the entire N blocks of the image are
quantized and encoded with quantization steps

    Q/S_1, Q/S_2, ..., Q/S_N                    Equation 1

respectively, where S_k is the sensitivity associated to the
kth block, the perceived quality of the encoded frame will
be the same as if all the blocks were quantized with step
size Q. Since the S_k's are smaller than or equal to 1, the
resulting quantizers in Equation 1 will be as large as or
larger than Q, and therefore produce fewer bits when
encoding a given frame.

To summarize, the result of such an encoding
scheme, where the sensitivities are representative of the
perceived image quality based on a model of the human visual
system and the quantization factor is varied with respect to
the sensitivity information, is an image that has a
perceived uniform quality. This also provides a minimum bit
rate with the uniform quality.
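
The scaling of Equation 1 can be sketched as follows. This is
an illustration under stated assumptions rather than the
encoder's actual implementation: the sensitivity map is a
plain list of rows, each block's sensitivity is taken as the
maximum over its pixels (an average would be another
reasonable choice), and the names block_sensitivities and
quantizer_steps are hypothetical.

```python
def block_sensitivities(s_map, block=16):
    """Reduce a per-pixel sensitivity map to one value per block (max)."""
    height, width = len(s_map), len(s_map[0])
    out = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            out.append(max(s_map[y][x]
                           for y in range(by, min(by + block, height))
                           for x in range(bx, min(bx + block, width))))
    return out

def quantizer_steps(base_q, sensitivities):
    """Equation 1: block k uses step Q / S_k, where 0 < S_k <= 1."""
    return [base_q / s_k for s_k in sensitivities]

# Toy 2x4 map split into 2x2 blocks; the left block is being looked at.
toy_map = [[1.0, 0.9, 0.5, 0.4],
           [0.9, 0.8, 0.4, 0.25]]
s_blocks = block_sensitivities(toy_map, block=2)
steps = quantizer_steps(8.0, s_blocks)
print(s_blocks, steps)
```

Because every S_k is at most 1, no step in the result is finer
than the base Q, which is why this strategy can only reduce
the bit count.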

The following steps may be used to reduce the
number of bits for a video frame using a pre-selected base
quantization step size Q.

STEP 1. Initially set k equal to 1.

STEP 2. Find the maximum value of the sensitivity
for the pixels in the kth block, S_k,

    S_k = max(S_k,1, S_k,2, S_k,3, ..., S_k,L)    Equation 2

where S_k,i is the sensitivity for the ith pixel in the kth
block. Alternatively, the maximum operation could be
replaced by any other suitable evaluation of the
sensitivities of a block, such as the average of the
sensitivities.

STEP 3. Encode the kth block using a quantizer of
step size Q/S_k.

STEP 4. If k<N, then let k=k+1 and go to Step 2.
Otherwise stop.



DETAILS OF THIRD STRATEGY 
in .any system the total number of bits available 
for encoding a video frame i. oW ^^^t. 
user or the selecting the 

achieved as suggested by line 124 of FIG 13 



words, selecting the numbepo.f_bi^ 
aforementioned base Q likeKv^not^ 

bandwidth . / 
A model for the number of bits invested in the ith
image block is:

Equation 3

where Q_i is the quantization scale, A is the number of
pixels in a block (e.g., A=16^2 pixels), and K and C are
model parameters (described below). σ_i is the empirical
standard deviation of the pixels in the block, and is
defined as:

Equation 4

with P_i(j) the value of the jth pixel in the ith block and
P̄_i the average of the pixel values in the block. P̄_i is
defined as,

Equation 5

For a color image, the P_i(j)'s are the values of the
luminance and chrominance components for the block pixels.
The model of Equation 3 was derived using a rate-distortion
analysis of the block's encoder and is discussed in greater
detail in co-pending United States Patent Application Serial
No. 09/008,137, filed January 16, 1998, incorporated by
reference herein.

K and C are model parameters. K depends on the
encoder efficiency and the distribution of the pixel values,
and C is the number of bits for encoding overhead
information (e.g., motion vectors, syntax elements, etc.).
Preferably, the values of K and C are not known in advance
and are estimated during encoding.
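
To make the roles of A, K, C, and σ_i concrete, the sketch
below assumes the bit model has the form
B_i = A*(K*σ_i^2/Q_i^2 + C). That form is consistent with the
symbol definitions above but is an assumption here, as are
the function names and the sample values K=0.5 and C=0. The
standard deviation uses the 1/A normalization.

```python
import math

def block_std(pixels):
    """Empirical standard deviation of a block's pixels (1/A normalization)."""
    a = len(pixels)
    mean = sum(pixels) / a
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / a)

def predicted_bits(sigma, q, a=16 * 16, k=0.5, c=0.0):
    """Assumed rate model: bits grow with variance and shrink as 1/Q^2."""
    return a * (k * sigma ** 2 / q ** 2 + c)

flat = [128] * 256             # no detail at all
textured = [100, 156] * 128    # strong alternating pattern
for name, blk in (("flat", flat), ("textured", textured)):
    sigma = block_std(blk)
    print(name, sigma, predicted_bits(sigma, q=8), predicted_bits(sigma, q=16))
```

Under this assumed model, doubling the step size cuts the
predicted texture bits by a factor of four, while the
overhead term C is unaffected by Q.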

The objective of the staircase technique is to find
the value of the quantization steps that satisfy the
following two conditions:

(1) the total number of bits produced for the
image is a pre-selected target B; and

(2) the overall image quality is perceived as
homogenous, constant, or uniform.

Let N be the number of blocks in the frame. The first
condition, in terms of the encoder model, is:

Equation 6

As described in relation to the second strategy, the second
condition is satisfied by a set of quantizers:

Equation 7

where (Q'/S_k)=Q_k is the quantization step of the kth block,
but now Q' is not known.

Combining Equations 6 and 7 the following equation is
obtained:

Equation 8

The following expression for Q' is obtained from Equation 8:

Equation 9

Equation 9 is the basis for the preferred rate control
technique, described below.
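
For reference, one way the chain from Equation 6 to
Equation 9 can be written out is given here. It assumes the
bit model of Equation 3 has the form
B_i = A(K σ_i^2/Q_i^2 + C); that form is an assumption, so the
expressions are a sketch of the derivation rather than a
transcription of the original equations.

```latex
\begin{align*}
\sum_{i=1}^{N} A\left(K\frac{\sigma_i^2}{Q_i^2} + C\right) &= B
  && \text{(total bits equal the target, cf. Equation 6)}\\
Q_k &= \frac{Q'}{S_k}, \quad k = 1,\dots,N
  && \text{(uniform perceived quality, cf. Equation 7)}\\
\frac{AK}{Q'^2}\sum_{k=1}^{N}\sigma_k^2 S_k^2 + ANC &= B
  && \text{(substitution, cf. Equation 8)}\\
Q' &= \sqrt{\frac{AK}{B - ANC}\sum_{k=1}^{N}\sigma_k^2 S_k^2}
  && \text{(solving for } Q'\text{, cf. Equation 9)}
\end{align*}
```

The last line requires B > ANC, that is, the target must
exceed the overhead bits alone.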

The quantizers for encoding the N image blocks in
a frame are preferably selected with the following
technique.

STEP 1. Initialization. Let i=1 (first block),
B_1=B (available bits), N_1=N (number of blocks). Let

Equation 10

where σ_k and S_k are defined in Equations 4 and 2,
respectively. If the values of the parameters K and C in
the encoder model are known or estimated in advance, e.g.,
using linear regression, let K_1=K and C_1=C. If the model
parameters are not known, set K_1 and C_1 to some small
non-negative values, such as K_1=0.5 and C_1=0, as initial
estimates. In video coding, one could set K_1 and C_1 to the
values K_N+1 and C_N+1, respectively, from the previous
encoded frame, or any other suitable value.

STEP 2. The quantization parameter for the ith
block is computed as follows:

Equation 11

If the values of the Q-parameters are restricted to a fixed
set (e.g., in H.263, Q_i=2QP and QP takes values in
{1,2,3,...,31}), round Q_i to the nearest value in the set.
The square root operation can be implemented using look-up
tables.

STEP 3. The ith block is encoded with a block-
based coder, such as the coder of FIG. 12. Let B_i' be the
number of bits used to encode the ith block. Compute

Equation 12

STEP 4. The parameters K_i+1 and C_i+1 of the coder
model are updated. For the fixed mode, K_i+1=K and C_i+1=C.
For the adaptive mode, K_i+1 and C_i+1 are determined using
any suitable technique for model fitting. For example, one
could use the model fitting techniques in co-pending United
States Patent Application, Serial No. 09/008,137,
incorporated by reference herein.

STEP 5. If i=N, stop (all image blocks are
encoded). If not, i=i+1 and go to Step 2.
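
The loop of Steps 1 through 5 can be sketched as follows. This
is a minimal illustration, not the patented procedure itself.
It assumes the bit model B_i = A*(K*σ_i^2/Q_i^2 + C), keeps K
and C fixed (the fixed mode), tracks the remaining bits and
the remaining sum of σ_k^2*S_k^2 after each block, and omits
rounding to a codec's allowed step sizes. The callback
encode_block, the cap max_q, and all other names are
hypothetical.

```python
import math

def select_quantizers(sigmas, sens, target_bits, encode_block,
                      a=256, k=0.5, c=0.0, max_q=62.0):
    """Choose Q_i block by block so the frame lands near target_bits."""
    n = len(sigmas)
    bits_left = float(target_bits)
    # Remaining sum of sigma_k^2 * S_k^2 over blocks not yet encoded.
    weight_left = sum((s * w) ** 2 for s, w in zip(sigmas, sens))
    steps = []
    for i in range(n):
        budget = bits_left - a * (n - i) * c   # bits left after overhead
        if budget > 0 and weight_left > 0:
            q = min(max_q, math.sqrt(a * k * weight_left / budget) / sens[i])
        else:
            q = max_q                          # out of bits: coarsest step
        steps.append(q)
        bits_left -= encode_block(i, q)        # subtract bits actually spent
        weight_left -= (sigmas[i] * sens[i]) ** 2
    return steps

# Stand-in encoder that follows the assumed model exactly (C = 0).
sigmas = [20.0, 30.0, 25.0, 10.0]
sens = [0.4, 1.0, 0.7, 0.3]
spent = []

def model_encoder(i, q, a=256, k=0.5):
    bits = a * k * sigmas[i] ** 2 / q ** 2
    spent.append(bits)
    return bits

steps = select_quantizers(sigmas, sens, 2000, model_encoder)
print([round(q, 2) for q in steps], round(sum(spent), 1))
```

With a real encoder the bits spent per block deviate from the
model, and recomputing the step from the remaining budget at
every block is what lets the loop absorb those deviations.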

ENCODER SYSTEM
Referring to FIG. 14, an encoder system 200
includes an input image sequence 202 which is passed to the
encoder 100 and a sub-sample block 204 which decomposes the
image sequence 202 into macro-blocks. The macro-blocks from
the sub-sample block 204 are passed to the visual model 60
and the face detection module 10. An optional gaze
direction measurement block 206 detects the location of the
gaze of the viewer. The output from either the measurement
block 206 or detection module 10 is passed to the visual
model 60 and optionally to an encode gaze parameters block
208. Calibration parameters for the pixel size and/or
viewing distance are provided to the visual model 60 and the
encode gaze parameters block 208 by a calibration block 210.
The visual model 60 provides its sensitivity output to the
encoder 100. The encoder 100 thereafter transmits encoded
data to a storage device or a decoder.
Referring to FIG. 15, the decoder 300 decodes the
gaze parameters with a decode gaze parameters block 302. A
visual model block 304 calculates both eccentricity versus
image location and sensitivity versus eccentricity. The
visual model block 304 provides quantization parameters to
the decode data block 306 which decodes the encoded data
based on the quantization parameters. An inverse transform
block 308 decomposes the data from the decode data block 306
to obtain a decompressed image sequence 310 for use, such as
being displayed on a display.

The terms and expressions which have been employed
in the foregoing specification are used therein as terms of
description and not of limitation, and there is no
intention, in the use of such terms and expressions, of
excluding equivalents of the features shown and described or
portions thereof, it being recognized that the scope of the
invention is defined and limited only by the claims which
follow.




